Search in the Catalogues and Directories

Hits 1 – 20 of 52

1
Joint Modeling of Code-Switched and Monolingual ASR via Conditional Factorization ...
Yan, Brian; Zhang, Chunlei; Yu, Meng. - : arXiv, 2021
BASE
2
Source and Target Bidirectional Knowledge Distillation for End-to-end Speech Translation ...
BASE
3
Self-Guided Curriculum Learning for Neural Machine Translation ...
Zhou, Lei; Ding, Liang; Duh, Kevin. - : arXiv, 2021
BASE
4
Arabic Speech Recognition by End-to-End, Modular Systems and Human ...
BASE
5
Leveraging End-to-End ASR for Endangered Language Documentation: An Empirical Study on Yoloxóchitl Mixtec ...
BASE
6
On Prosody Modeling for ASR+TTS based Voice Conversion ...
Abstract: In voice conversion (VC), an approach showing promising results in the latest voice conversion challenge (VCC) 2020 is to first use an automatic speech recognition (ASR) model to transcribe the source speech into the underlying linguistic contents; these are then used as input by a text-to-speech (TTS) system to generate the converted speech. Such a paradigm, referred to as ASR+TTS, overlooks the modeling of prosody, which plays an important role in speech naturalness and conversion similarity. Although some researchers have considered transferring prosodic clues from the source speech, there arises a speaker mismatch during training and conversion. To address this issue, in this work, we propose to directly predict prosody from the linguistic representation in a target-speaker-dependent manner, referred to as target text prediction (TTP). We evaluate both methods on the VCC2020 benchmark and consider different linguistic representations. The results demonstrate the effectiveness of TTP in both objective and ...
Comment: Submitted to ASRU2021. Under review ...
Keyword: Audio and Speech Processing eess.AS; Computation and Language cs.CL; FOS Computer and information sciences; FOS Electrical engineering, electronic engineering, information engineering; Sound cs.SD
URL: https://arxiv.org/abs/2107.09477
https://dx.doi.org/10.48550/arxiv.2107.09477
BASE
7
Leveraging Pre-trained Language Model for Speech Sentiment Analysis ...
Shon, Suwon; Brusco, Pablo; Pan, Jing. - : arXiv, 2021
BASE
8
End-to-end ASR to jointly predict transcriptions and linguistic annotations ...
NAACL 2021; Fujita, Yuya; Omachi, Motoi. - : Underline Science Inc., 2021
BASE
9
Differentiable Allophone Graphs for Language-Universal Speech Recognition ...
BASE
10
Speech Representation Learning Combining Conformer CPC with Deep Cluster for the ZeroSpeech Challenge 2021 ...
BASE
11
CHiME-6 Challenge: Tackling multispeaker speech recognition for unsegmented recordings
In: CHiME 2020 - 6th International Workshop on Speech Processing in Everyday Environments, May 2020, Barcelona / Virtual, Spain (2020) ; https://hal.inria.fr/hal-02546993
BASE
12
Learning Speaker Embedding from Text-to-Speech ...
BASE
13
Massively Multilingual Adversarial Speech Recognition ...
BASE
14
A Comparative Study on Transformer vs RNN in Speech Applications ...
BASE
15
Multilingual End-to-End Speech Translation ...
BASE
16
Towards Online End-to-end Transformer Automatic Speech Recognition ...
BASE
17
Transformer ASR with Contextual Block Processing ...
BASE
18
The fifth 'CHiME' Speech Separation and Recognition Challenge: Dataset, task and baselines
In: Interspeech 2018 - 19th Annual Conference of the International Speech Communication Association, Sep 2018, Hyderabad, India (2018) ; https://hal.inria.fr/hal-01744021
BASE
19
Analysis of Multilingual Sequence-to-Sequence speech recognition systems ...
BASE
20
Language model integration based on memory control for sequence to sequence speech recognition ...
BASE
